@realAsma realAsma (Contributor) commented Sep 15, 2025

What does this PR do?

Type of change: code cleanup

Overview:

NeMo Docker images report the ModelOpt version as 0.0.0. As a result, the _amax not found warnings, which are needed only for ModelOpt < 0.29, are raised even with ModelOpt >= 0.29. These warnings cause confusion during ModelOpt restore.

We no longer need this check or the version-specific checkpointing. Version-specific checkpointing was removed for MCore some time ago (probably in 0.29/0.31).

This PR removes the unused code for version-specific checkpointing.
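To illustrate why a placeholder version of 0.0.0 triggers the legacy path, here is a minimal sketch of a version gate. The helper names and the cutoff constant are hypothetical and are not ModelOpt's actual code; the sketch only assumes that the removed check compared the reported version against 0.29.

```python
from __future__ import annotations


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version such as '0.29.0' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


# Hypothetical cutoff: versions below this are treated as legacy checkpoints.
LEGACY_CUTOFF = parse_version("0.29")


def needs_legacy_amax_handling(reported_version: str) -> bool:
    """Return True when the reported version looks older than the cutoff."""
    return parse_version(reported_version) < LEGACY_CUTOFF


print(needs_legacy_amax_handling("0.28.1"))  # True: genuinely old
print(needs_legacy_amax_handling("0.31.0"))  # False: current
print(needs_legacy_amax_handling("0.0.0"))   # True: placeholder misread as old
```

Any gate of this shape misclassifies a build that reports 0.0.0, which is consistent with the spurious warnings described above.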

Testing

Unit tests.

Before your PR is "Ready for review"

  • Make sure you read and follow Contributor guidelines and your commits are signed.
  • Is this change backward compatible?: Yes (the utilities we remove here are not used anymore).
  • Did you write any new necessary tests?: NA
  • Did you add or update any necessary documentation?: NA
  • Did you update Changelog?: NA

Summary by CodeRabbit

  • Refactor
    • Removed legacy checkpoint versioning across quantization components; state loading is now unified without special handling for older checkpoints. Deprecated version-related properties and helpers were eliminated. Older checkpoints may require manual migration.
  • Tests
    • Simplified save/restore tests by removing version-parameterized paths and legacy compatibility formatting.
  • Chores
    • Cleaned up obsolete version-comparison utilities and related imports to reduce maintenance overhead.

@realAsma realAsma requested review from a team as code owners September 15, 2025 20:26

coderabbitai bot commented Sep 15, 2025

Walkthrough

Removed ModelOpt checkpoint version handling across quantization modules, plugins, and tests. Eliminated mopt_ckpt_versn properties and related propagation, version checks, and legacy state-dict load logic. Simplified replacement and restore paths. Deleted versioned checkpoint utilities and tests; test flows now save/restore without version formatting.

Changes

| Cohort / File(s) | Summary of Changes |
|---|---|
| Core: removal of version API on QuantModule and conversion assignment (modelopt/torch/quantization/nn/modules/quant_module.py, modelopt/torch/quantization/conversion.py) | Deleted QuantModule.mopt_ckpt_versn getter/setter; removed assignment of mopt_ckpt_versn during module replacement. No signature changes. |
| TensorQuantizer: version properties and legacy load path removed (modelopt/torch/quantization/nn/modules/tensor_quantizer.py) | Removed mopt_ckpt_versn property and setter; deleted custom _load_from_state_dict legacy handling and related imports; base class loading now used. |
| Plugins: version checks/API removed (modelopt/torch/quantization/plugins/custom.py, modelopt/torch/quantization/plugins/vllm.py) | Deleted version comparison helper in custom plugin; removed mopt_ckpt_versn getter/setter in VLLM fused MoE; dropped related imports. |
| Tests: drop versioned checkpoint utils and paths (tests/_test_utils/torch_quantization/checkpointing.py, tests/_test_utils/torch_quantization/quantize_common.py, tests/gpu/torch/quantization/test_quantize_cuda.py) | Removed checkpoint formatting utility and its usage; save/restore no longer depends on version; deleted parameterized versioned CUDA test. |
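The TensorQuantizer change summarized above can be pictured with a small stand-in. This is a sketch only: BaseModule is a toy class, not torch.nn.Module, and the legacy condition and warning text are assumptions made for illustration rather than the code that was deleted.

```python
import warnings


class BaseModule:
    """Toy stand-in for a framework module with a default state loader."""

    def __init__(self):
        self.state = {}

    def _load_from_state_dict(self, state_dict, prefix):
        # Default behaviour: copy every entry stored under this module's prefix.
        for key, value in state_dict.items():
            if key.startswith(prefix):
                self.state[key[len(prefix):]] = value


class LegacyQuantizer(BaseModule):
    """Before: an override that special-cases old checkpoints (hypothetical logic)."""

    def __init__(self, ckpt_version):
        super().__init__()
        self.ckpt_version = ckpt_version

    def _load_from_state_dict(self, state_dict, prefix):
        # Assumed legacy check: warn when an old checkpoint lacks an _amax entry.
        if self.ckpt_version < (0, 29) and prefix + "_amax" not in state_dict:
            warnings.warn(f"{prefix}_amax not found in checkpoint")
        super()._load_from_state_dict(state_dict, prefix)


class Quantizer(BaseModule):
    """After: no override, so the base loader handles every checkpoint."""


checkpoint = {"q.scale": 0.5}  # hypothetical checkpoint without an _amax entry

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = LegacyQuantizer(ckpt_version=(0, 0, 0))
    legacy._load_from_state_dict(checkpoint, "q.")
    current = Quantizer()
    current._load_from_state_dict(checkpoint, "q.")

print(len(caught))    # 1: only the legacy override warned
print(current.state)  # {'scale': 0.5}
```

Both classes end up with the same loaded state; the only difference is the warning emitted by the version-gated override, which is the behaviour the PR description reports as confusing.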

Sequence Diagram(s)

sequenceDiagram
  participant User as Caller
  participant QM as Quantized Model
  participant TQ as TensorQuantizer
  participant PT as PyTorch Loader

  Note over User,PT: Previous (legacy) restore flow
  User->>PT: load_state_dict(checkpoint)
  activate PT
  PT->>TQ: _load_from_state_dict(state, prefix,...)
  Note right of TQ: Handle legacy ModelOpt versions\nmigrate buffers, set mopt_ckpt_versn,\nissue warnings
  TQ-->>PT: normalized state loaded
  deactivate PT
  PT-->>User: load complete

sequenceDiagram
  participant User as Caller
  participant QM as Quantized Model
  participant PT as PyTorch Loader

  Note over User,PT: New restore flow
  User->>PT: load_state_dict(checkpoint)
  activate PT
  PT->>QM: default module load path
  Note right of QM: No special ModelOpt version handling\n(no mopt_ckpt_versn propagation)
  PT-->>User: load complete
  deactivate PT

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

I buried the versions in a tidy burrow,
Sleek codepaths trimmed for a faster tomorrow.
No more carrots labeled “0.28” to chew—
Just clean hops through load and quant, swift and new.
Thump-thump! My paws applaud this lighter trail.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The title concisely and accurately summarizes the PR’s primary change: removal of legacy checkpointing utilities for MCore/ModelOpt versions older than 0.29 that are no longer used. It is specific, clear, and directly reflects the deletions shown in the changeset so a reviewer scanning history will understand the main intent. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 76e8ce2 and eba1dae.

📒 Files selected for processing (8)
  • modelopt/torch/quantization/conversion.py (0 hunks)
  • modelopt/torch/quantization/nn/modules/quant_module.py (0 hunks)
  • modelopt/torch/quantization/nn/modules/tensor_quantizer.py (0 hunks)
  • modelopt/torch/quantization/plugins/custom.py (0 hunks)
  • modelopt/torch/quantization/plugins/vllm.py (0 hunks)
  • tests/_test_utils/torch_quantization/checkpointing.py (0 hunks)
  • tests/_test_utils/torch_quantization/quantize_common.py (0 hunks)
  • tests/gpu/torch/quantization/test_quantize_cuda.py (0 hunks)
💤 Files with no reviewable changes (8)
  • modelopt/torch/quantization/nn/modules/quant_module.py
  • modelopt/torch/quantization/plugins/vllm.py
  • modelopt/torch/quantization/nn/modules/tensor_quantizer.py
  • tests/_test_utils/torch_quantization/checkpointing.py
  • modelopt/torch/quantization/plugins/custom.py
  • tests/_test_utils/torch_quantization/quantize_common.py
  • tests/gpu/torch/quantization/test_quantize_cuda.py
  • modelopt/torch/quantization/conversion.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: linux
  • GitHub Check: wait-checks / wait
  • GitHub Check: code-quality

@realAsma realAsma force-pushed the asma/remove_tq_unnecessary_warnings branch from 21c5b22 to eba1dae Compare September 15, 2025 20:27

codecov bot commented Sep 15, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.86%. Comparing base (76e8ce2) to head (eba1dae).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #322      +/-   ##
==========================================
- Coverage   73.88%   73.86%   -0.03%     
==========================================
  Files         172      172              
  Lines       17444    17410      -34     
==========================================
- Hits        12889    12860      -29     
+ Misses       4555     4550       -5     

@ChenhanYu ChenhanYu (Collaborator) left a comment

The internal CI went through. Approved

@ChenhanYu ChenhanYu (Collaborator) left a comment

Went through the utils part only.

@realAsma realAsma merged commit 8a5736a into main Sep 16, 2025
22 checks passed
@realAsma realAsma deleted the asma/remove_tq_unnecessary_warnings branch September 16, 2025 19:45
yeyu-nvidia pushed a commit that referenced this pull request Sep 18, 2025